Results 1 - 20 of 49
1.
Nat Metab ; 6(3): 550-566, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38448615

ABSTRACT

The post-translational modification lysine succinylation is implicated in the regulation of various metabolic pathways. However, its biological relevance remains uncertain due to methodological difficulties in determining high-impact succinylation sites. Here, using stable isotope labelling and data-independent acquisition mass spectrometry, we quantified lysine succinylation stoichiometries in mouse livers. Despite the low overall stoichiometry of lysine succinylation, several high-stoichiometry sites were identified, especially upon deletion of the desuccinylase SIRT5. In particular, multiple high-stoichiometry lysine sites identified in argininosuccinate synthase (ASS1), a key enzyme in the urea cycle, are regulated by SIRT5. Mutation of the high-stoichiometry lysine in ASS1 to succinyl-mimetic glutamic acid significantly decreased its enzymatic activity. Metabolomics profiling confirms that SIRT5 deficiency decreases urea cycle activity in liver. Importantly, SIRT5 deficiency compromises ammonia tolerance, which can be reversed by the overexpression of wild-type, but not succinyl-mimetic, ASS1. Therefore, lysine succinylation is functionally important in ammonia metabolism.


Subject(s)
Lysine , Sirtuins , Mice , Animals , Lysine/chemistry , Lysine/metabolism , Ammonia , Sirtuins/metabolism , Knockout Mice , Urea
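
As a rough illustration of what a site "stoichiometry" means in this abstract, the sketch below computes the modified fraction of a lysine site from two intensities. The function name and the example intensities are hypothetical, and the abstract does not describe the study's actual calculation, so this shows only the generic occupancy ratio.

```python
def site_stoichiometry(modified_intensity: float, unmodified_intensity: float) -> float:
    """Return the fraction of a site that carries the modification.

    Generic occupancy ratio: modified / (modified + unmodified).
    This is an assumed formulation, not the study's pipeline.
    """
    total = modified_intensity + unmodified_intensity
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return modified_intensity / total


# Hypothetical intensities in arbitrary units, not data from the study.
low_site = site_stoichiometry(2.0, 98.0)    # a low-stoichiometry site
high_site = site_stoichiometry(30.0, 70.0)  # a high-stoichiometry site
print(f"low: {low_site:.1%}, high: {high_site:.1%}")
```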
2.
bioRxiv ; 2024 Feb 08.
Article in English | MEDLINE | ID: mdl-38370692

ABSTRACT

Non-invasive detection of protein biomarkers in plasma is crucial for clinical purposes. Liquid chromatography mass spectrometry (LC-MS) is the gold standard technique for plasma proteome analysis, but despite recent advances, it remains limited by throughput, cost, and coverage. Here, we introduce a new hybrid method, which integrates direct infusion shotgun proteome analysis (DISPA) with nanoparticle (NP) protein coronas enrichment for high throughput and efficient plasma proteomic profiling. We realized over 280 protein identifications in 1.4 minutes collection time, which enables a potential throughput of approximately 1,000 samples daily. The identified proteins are involved in valuable pathways and 44 of the proteins are FDA approved biomarkers. The robustness and quantitative accuracy of this method were evaluated across multiple NPs and concentrations with a mean coefficient of variation at 17%. Moreover, different protein corona profiles were observed among various nanoparticles based on their distinct surface modifications, and all NP protein profiles exhibited deeper coverage and better quantification than neat plasma. Our streamlined workflow merges coverage and throughput with precise quantification, leveraging both DISPA and NP protein corona enrichments. This underscores the significant potential of DISPA when paired with NP sample preparation techniques for plasma proteome studies.
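
The throughput figure quoted above can be checked with a short calculation. This is a minimal sketch that assumes back-to-back acquisitions with no overhead between samples, which the abstract does not state; the function name is illustrative.

```python
MINUTES_PER_DAY = 24 * 60


def samples_per_day(minutes_per_sample: float) -> float:
    """Upper bound on daily throughput, assuming zero time between samples."""
    if minutes_per_sample <= 0:
        raise ValueError("minutes_per_sample must be positive")
    return MINUTES_PER_DAY / minutes_per_sample


# 1.4 minutes of collection time per sample, as reported in the abstract.
print(round(samples_per_day(1.4)))  # 1440 / 1.4 is about 1029 samples
```

Under that assumption the result is consistent with the abstract's "approximately 1,000 samples daily".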

3.
ArXiv ; 2023 Nov 13.
Article in English | MEDLINE | ID: mdl-38013887

ABSTRACT

Proteomics is the large-scale study of protein structure and function from biological systems through protein identification and quantification. "Shotgun proteomics" or "bottom-up proteomics" is the prevailing strategy, in which proteins are hydrolyzed into peptides that are analyzed by mass spectrometry. Proteomics can be applied to diverse studies ranging from simple protein identification to studies of proteoforms, protein-protein interactions, protein structural alterations, absolute and relative protein quantification, post-translational modifications, and protein stability. To enable this range of different experiments, there are diverse strategies for proteome analysis. The nuances of how proteomic workflows differ may be challenging to understand for new practitioners. Here, we provide a comprehensive overview of different proteomics methods to aid the novice and experienced researcher. We cover topics ranging from biochemistry basics and protein extraction to biological interpretation and orthogonal validation. We expect this work to serve as a basic resource for new practitioners in the field of shotgun or bottom-up proteomics.

4.
iScience ; 26(10): 107785, 2023 Oct 20.
Article in English | MEDLINE | ID: mdl-37727736

ABSTRACT

Non-typeable Haemophilus influenzae (NTHi) causes millions of infections each year. Though it is primarily known to cause otitis media, recent studies have shown NTHi is emerging as a primary pathogen for invasive infection, prompting the need for new vaccines and treatments. Lipooligosaccharide (LOS) has been identified as a potential vaccine candidate due to its immunogenic nature and outer membrane localization. Yet, phase variable expression of genes involved in LOS synthesis has complicated vaccine development. In this study, we used a chinchilla model of otitis media to investigate how phase variation of oafA, a gene involved in LOS biosynthesis, affects antibody production in response to infection. We found that acetylation of LOS by OafA inhibited production of LOS-specific antibodies during infection and that NTHi expressing acetylated LOS were subsequently better protected against opsonophagocytic killing. These findings highlight the importance of understanding how phase variable modifications might affect vaccine efficacy and success.

5.
BioData Min ; 16(1): 20, 2023 Jul 13.
Article in English | MEDLINE | ID: mdl-37443040

ABSTRACT

The introduction of large language models (LLMs) that allow iterative "chat" in late 2022 is a paradigm shift that enables generation of text often indistinguishable from that written by humans. LLM-based chatbots have immense potential to improve academic work efficiency, but the ethical implications of their fair use and inherent bias must be considered. In this editorial, we discuss this technology from the academic's perspective with regard to its limitations and utility for academic writing, education, and programming. We end with our stance with regard to using LLMs and chatbots in academia, which is summarized as (1) we must find ways to effectively use them, (2) their use does not constitute plagiarism (although they may produce plagiarized text), (3) we must quantify their bias, (4) users must be cautious of their poor accuracy, and (5) the future is bright for their application to research and as an academic tool.

6.
bioRxiv ; 2023 Jun 29.
Article in English | MEDLINE | ID: mdl-37425781

ABSTRACT

Combined multi-omics analysis of proteomics, polar metabolomics, and lipidomics requires separate liquid chromatography-mass spectrometry (LC-MS) platforms for each omics layer. This requirement for different platforms limits throughput and increases costs, preventing the application of mass spectrometry-based multi-omics to large scale drug discovery or clinical cohorts. Here, we present an innovative strategy for simultaneous multi-omics analysis by direct infusion (SMAD) using one single injection without liquid chromatography. SMAD allows quantification of over 9,000 metabolite m/z features and over 1,300 proteins from the same sample in less than five minutes. We validated the efficiency and reliability of this method and then present two practical applications: mouse macrophage M1/M2 polarization and high throughput drug screening in human 293T cells. Finally, we demonstrate that relationships between proteomic and metabolomic data can be discovered by machine learning.

7.
J Am Soc Mass Spectrom ; 34(9): 1858-1867, 2023 Sep 06.
Article in English | MEDLINE | ID: mdl-37463334

ABSTRACT

Skeletal muscle is a major regulatory tissue of whole-body metabolism and is composed of a diverse mixture of cell (fiber) types. Aging and several diseases differentially affect the various fiber types, and therefore, investigating the changes in the proteome in a fiber-type specific manner is essential. Recent breakthroughs in isolated single muscle fiber proteomics have started to reveal heterogeneity among fibers. However, existing procedures are slow and laborious, requiring 2 h of mass spectrometry time per single muscle fiber; 50 fibers would take approximately 4 days to analyze. Thus, to capture the high variability in fibers both within and between individuals requires advancements in high throughput single muscle fiber proteomics. Here we use a single cell proteomics method to enable quantification of single muscle fiber proteomes in 15 min total instrument time. As proof of concept, we present data from 53 isolated skeletal muscle fibers obtained from two healthy individuals analyzed in 13.25 h. Adapting single cell data analysis techniques to integrate the data, we can reliably separate type 1 and 2A fibers. Ninety-four proteins were statistically different between clusters indicating alteration of proteins involved in fatty acid oxidation, oxidative phosphorylation, and muscle structure and contractile function. Our results indicate that this method is significantly faster than prior single fiber methods in both data collection and sample preparation while maintaining sufficient proteome depth. We anticipate this assay will enable future studies of single muscle fibers across hundreds of individuals, which has not been possible previously due to limitations in throughput.


Subject(s)
Proteome , Proteomics , Humans , Proteome/metabolism , Proteomics/methods , Workflow , Skeletal Muscle Fibers/metabolism , Skeletal Muscle
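
The instrument-time figures in this abstract follow from simple arithmetic. The sketch below reproduces them; the helper name is illustrative, and it counts instrument time only, ignoring sample preparation.

```python
def total_hours(n_fibers: int, minutes_per_fiber: float) -> float:
    """Total mass spectrometry time in hours for a batch of single fibers."""
    return n_fibers * minutes_per_fiber / 60


# Reported method: 53 fibers at 15 min of instrument time each.
fast_hours = total_hours(53, 15)
# Prior procedures: 2 h (120 min) per fiber, for the 50-fiber example.
slow_hours = total_hours(50, 120)

print(fast_hours)       # 13.25 hours, matching the abstract
print(slow_hours / 24)  # roughly 4.2 days, i.e. "approximately 4 days"
print(120 / 15)         # 8-fold less instrument time per fiber
```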
8.
Anal Chem ; 95(24): 9145-9150, 2023 Jun 20.
Article in English | MEDLINE | ID: mdl-37289937

ABSTRACT

Identification and proteomic characterization of rare cell types within complex organ-derived cell mixtures is best accomplished by label-free quantitative mass spectrometry. High throughput is required to rapidly survey hundreds to thousands of individual cells to adequately represent rare populations. Here we present parallelized nanoflow dual-trap single-column liquid chromatography (nanoDTSC) operating at 15 min of total run time per cell with peptides quantified over 11.5 min using standard commercial components, thus offering an accessible and efficient LC solution to analyze 96 single cells per day. At this throughput, nanoDTSC quantified over 1000 proteins in individual cardiomyocytes and heterogeneous populations of single cells from the aorta.


Subject(s)
Proteins , Proteomics , Proteomics/methods , Liquid Chromatography/methods , Proteins/chemistry , Peptides/chemistry , Mass Spectrometry/methods
9.
bioRxiv ; 2023 Oct 23.
Article in English | MEDLINE | ID: mdl-37162892

ABSTRACT

Background: Descending thoracic aortic aneurysms and dissections can go undetected until severe and catastrophic, and few clinical indices exist to screen for aneurysms or predict risk of dissection. Methods: This study generated a plasma proteomic dataset from 75 patients with descending type B dissection (Type B) and 62 patients with descending thoracic aortic aneurysm (DTAA). Standard statistical approaches were compared to supervised machine learning (ML) algorithms to distinguish Type B from DTAA cases. Quantitatively similar proteins were clustered based on linkage distance from hierarchical clustering and ML models were trained with uncorrelated protein lists across various linkage distances with hyperparameter optimization using 5-fold cross validation. Permutation importance (PI) was used for ranking the most important predictor proteins of ML classification between disease states and the proteins among the top 10 PI protein groups were submitted for pathway analysis. Results: Of the 1,549 peptides and 198 proteins used in this study, no peptides and only one protein, hemopexin (HPX), were significantly different at an adjusted p-value <0.01 between Type B and DTAA cases. The highest performing model on the training set (Support Vector Classifier) and its corresponding linkage distance (0.5) were used for evaluation of the test set, yielding a precision-recall area under the curve of 0.7 for distinguishing Type B from DTAA cases. The five proteins with the highest PI scores were immunoglobulin heavy variable 6-1 (IGHV6-1), lecithin-cholesterol acyltransferase (LCAT), coagulation factor 12 (F12), HPX, and immunoglobulin heavy variable 4-4 (IGHV4-4). All proteins from the top 10 most important correlated groups generated the following significantly enriched pathways in the plasma of Type B versus DTAA patients: complement activation, humoral immune response, and blood coagulation.
Conclusions: We conclude that ML may be useful in differentiating the plasma proteome of highly similar disease states that would otherwise not be distinguishable using statistics, and, in such cases, ML may enable prioritizing important proteins for model prediction.
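
To make the permutation importance (PI) step concrete, here is a minimal pure-Python sketch. It swaps in a hypothetical threshold classifier and made-up toy data in place of the study's Support Vector Classifier and plasma proteome, so it illustrates only the general idea: shuffle one feature column and measure how much accuracy drops.

```python
import random
from typing import Callable, List, Sequence


def accuracy(model: Callable[[Sequence[float]], int],
             rows: List[List[float]], labels: List[int]) -> float:
    """Fraction of rows whose predicted label matches the true label."""
    hits = sum(1 for row, y in zip(rows, labels) if model(row) == y)
    return hits / len(labels)


def permutation_importance(model, rows, labels, n_repeats=20, seed=0):
    """Mean accuracy drop per feature after shuffling that feature's column."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for col in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in rows]
            rng.shuffle(shuffled)
            permuted = [list(row[:col]) + [value] + list(row[col + 1:])
                        for row, value in zip(rows, shuffled)]
            drops.append(baseline - accuracy(model, permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noise.
toy_rows = [[0.1, 5.0], [0.2, 1.0], [0.3, 4.0], [0.8, 2.0], [0.9, 5.0], [0.7, 3.0]]
toy_labels = [0, 0, 0, 1, 1, 1]


def threshold_model(row):
    """Stand-in classifier that looks only at feature 0."""
    return int(row[0] > 0.5)


scores = permutation_importance(threshold_model, toy_rows, toy_labels)
print(scores)  # feature 1 scores exactly 0.0 because the model never reads it
```

A feature the model ignores always gets an importance of zero, which is why PI can rank predictor proteins even when univariate statistics find few significant differences.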

10.
Nat Biotechnol ; 41(12): 1776-1786, 2023 Dec.
Article in English | MEDLINE | ID: mdl-36959352

ABSTRACT

An average shotgun proteomics experiment detects approximately 10,000 human proteins from a single sample. However, individual proteins are typically identified by peptide sequences representing a small fraction of their total amino acids. Hence, an average shotgun experiment fails to distinguish different protein variants and isoforms. Deeper proteome sequencing is therefore required for the global discovery of protein isoforms. Using six different human cell lines, six proteases, deep fractionation and three tandem mass spectrometry fragmentation methods, we identify a million unique peptides from 17,717 protein groups, with a median sequence coverage of approximately 80%. Direct comparison with RNA expression data provides evidence for the translation of most nonsynonymous variants. We have also hypothesized that undetected variants likely arise from mutation-induced protein instability. We further observe comparable detection rates for exon-exon junction peptides representing constitutive and alternative splicing events. Our dataset represents a resource for proteoform discovery and provides direct evidence that most frame-preserving alternatively spliced isoforms are translated.


Subject(s)
Alternative Splicing , Proteome , Humans , Proteome/genetics , Proteome/metabolism , Protein Isoforms/genetics , Alternative Splicing/genetics , Peptides/chemistry , Amino Acid Sequence
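
Sequence coverage, the metric quoted in this abstract, is the fraction of a protein's residues that fall within at least one identified peptide. The sketch below is an assumed, simplified formulation using exact substring matching; the protein and peptide strings are invented for illustration.

```python
from typing import Iterable


def sequence_coverage(protein: str, peptides: Iterable[str]) -> float:
    """Fraction of residues covered by at least one exactly matching peptide."""
    if not protein:
        raise ValueError("protein sequence must not be empty")
    covered = [False] * len(protein)
    for peptide in peptides:
        start = protein.find(peptide)
        while start != -1:
            # Mark every residue spanned by this occurrence.
            for i in range(start, start + len(peptide)):
                covered[i] = True
            start = protein.find(peptide, start + 1)
    return sum(covered) / len(protein)


# Hypothetical 16-residue protein; "XYZ" does not occur and contributes nothing.
coverage = sequence_coverage("MKTAYIAKQRQISFVK", ["MKTAY", "AKQR", "XYZ"])
print(coverage)  # 9 of 16 residues covered = 0.5625
```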
11.
bioRxiv ; 2023 Feb 23.
Article in English | MEDLINE | ID: mdl-36865126

ABSTRACT

Skeletal muscle is a major regulatory tissue of whole-body metabolism and is composed of a diverse mixture of cell (fiber) types. Aging and several diseases differentially affect the various fiber types, and therefore, investigating the changes in the proteome in a fiber-type specific manner is essential. Recent breakthroughs in isolated single muscle fiber proteomics have started to reveal heterogeneity among fibers. However, existing procedures are slow and laborious, requiring two hours of mass spectrometry time per single muscle fiber; 50 fibers would take approximately four days to analyze. Thus, to capture the high variability in fibers both within and between individuals requires advancements in high throughput single muscle fiber proteomics. Here we use a single cell proteomics method to enable quantification of single muscle fiber proteomes in 15 minutes total instrument time. As proof of concept, we present data from 53 isolated skeletal muscle fibers obtained from two healthy individuals analyzed in 13.25 hours. Adapting single cell data analysis techniques to integrate the data, we can reliably separate type 1 and 2A fibers. Sixty-five proteins were statistically different between clusters indicating alteration of proteins involved in fatty acid oxidation, muscle structure and regulation. Our results indicate that this method is significantly faster than prior single fiber methods in both data collection and sample preparation while maintaining sufficient proteome depth. We anticipate this assay will enable future studies of single muscle fibers across hundreds of individuals, which has not been possible previously due to limitations in throughput.

12.
J Hepatol ; 79(1): 25-42, 2023 Jul.
Article in English | MEDLINE | ID: mdl-36822479

ABSTRACT

BACKGROUND & AIMS: The consumption of sugar and a high-fat diet (HFD) promotes the development of obesity and metabolic dysfunction. Despite their well-known synergy, the mechanisms by which sugar worsens the outcomes associated with a HFD are largely elusive. METHODS: Six-week-old, male, C57BL/6J mice were fed either chow or a HFD and were provided with regular, fructose- or glucose-sweetened water. Moreover, cultured AML12 hepatocytes were engineered to overexpress ketohexokinase-C (KHK-C) using a lentivirus vector, while CRISPR-Cas9 was used to knock down CPT1α. The cell culture experiments were complemented with in vivo studies using mice with hepatic overexpression of KHK-C and in mice with liver-specific CPT1α knockout. We used comprehensive metabolomics, electron microscopy, mitochondrial substrate phenotyping, proteomics and acetylome analysis to investigate underlying mechanisms. RESULTS: Fructose supplementation in mice fed normal chow and fructose or glucose supplementation in mice fed a HFD increase KHK-C, an enzyme that catalyzes the first step of fructolysis. Elevated KHK-C is associated with an increase in lipogenic proteins, such as ACLY, without affecting their mRNA expression. An increase in KHK-C also correlates with acetylation of CPT1α at K508, and lower CPT1α protein in vivo. In vitro, KHK-C overexpression lowers CPT1α and increases triglyceride accumulation. The effects of KHK-C are, in part, replicated by a knockdown of CPT1α. An increase in KHK-C correlates negatively with CPT1α protein levels in mice fed sugar and a HFD, but also in genetically obese db/db and lipodystrophic FIRKO mice. Mechanistically, overexpression of KHK-C in vitro increases global protein acetylation and decreases levels of the major cytoplasmic deacetylase, SIRT2. CONCLUSIONS: KHK-C-induced acetylation is a novel mechanism by which dietary fructose augments lipogenesis and decreases fatty acid oxidation to promote the development of metabolic complications.
IMPACT AND IMPLICATIONS: Fructose is a highly lipogenic nutrient whose negative consequences have been largely attributed to increased de novo lipogenesis. Herein, we show that fructose upregulates ketohexokinase, which in turn modifies global protein acetylation, including acetylation of CPT1α, to decrease fatty acid oxidation. Our findings broaden the impact of dietary sugar beyond its lipogenic role and have implications on drug development aimed at reducing the harmful effects attributed to sugar metabolism.


Subject(s)
Carnitine O-Palmitoyltransferase , Liver , Male , Mice , Animals , Carnitine O-Palmitoyltransferase/genetics , Carnitine O-Palmitoyltransferase/metabolism , Carnitine O-Palmitoyltransferase/pharmacology , Acetylation , Liver/metabolism , Obesity/metabolism , Glucose/metabolism , High-Fat Diet/adverse effects , Fatty Acids/metabolism , Fructose/metabolism , Fructokinases/genetics , Fructokinases/metabolism
13.
bioRxiv ; 2023 May 31.
Article in English | MEDLINE | ID: mdl-36711540

ABSTRACT

Identification and proteomic characterization of rare cell types within complex organ-derived cell mixtures is best accomplished by label-free quantitative mass spectrometry. High throughput is required to rapidly survey hundreds to thousands of individual cells to adequately represent rare populations. Here we present parallelized nanoflow dual-trap single-column liquid chromatography (nanoDTSC) operating at 15 minutes of total run time per cell with peptides quantified over 11.5 minutes using standard commercial components, thus offering an accessible and efficient LC solution to analyze 96 single cells per day. At this throughput, nanoDTSC quantified over 1,000 proteins in individual cardiomyocytes and heterogeneous populations of single cells from aorta.

14.
Anal Chem ; 95(2): 677-685, 2023 Jan 17.
Article in English | MEDLINE | ID: mdl-36527718

ABSTRACT

Large-scale proteome analysis requires rapid and high-throughput analytical methods. We recently reported a new paradigm in proteome analysis where direct infusion and ion mobility are used instead of liquid chromatography (LC) to achieve rapid and high-throughput proteome analysis. Here, we introduce an improved direct infusion shotgun proteome analysis protocol including label-free quantification (DISPA-LFQ) using CsoDIAq software. With CsoDIAq analysis of DISPA data, we can now identify up to ∼2000 proteins from the HeLa and 293T proteomes, and with DISPA-LFQ, we can quantify ∼1000 proteins from no more than 1 µg of sample within minutes. The identified proteins are involved in numerous valuable pathways including central carbon metabolism, nucleic acid replication and transport, protein synthesis, and endocytosis. Together with a high-throughput sample preparation method in a 96-well plate, we further demonstrate the utility of this technology for performing high-throughput drug analysis in human 293T cells. The total time for data collection from a whole 96-well plate is approximately 8 h. We conclude that the DISPA-LFQ strategy presents a valuable tool for fast identification and quantification of proteins in complex mixtures, which will power a high-throughput proteomic era of drug screening, biomarker discovery, and clinical analysis.


Subject(s)
Proteome , Proteomics , Humans , Proteome/analysis , Proteomics/methods , Liquid Chromatography/methods , Software
15.
Bioinformatics ; 38(21): 4908-4918, 2022 Oct 31.
Article in English | MEDLINE | ID: mdl-36106996

ABSTRACT

MOTIVATION: Cells respond to environments by regulating gene expression to exploit resources optimally. Recent advances in technologies allow for measuring the abundances of RNA, proteins, lipids and metabolites. These highly complex datasets reflect the states of the different layers in a biological system. Multi-omics is the integration of these disparate methods and data to gain a clearer picture of the biological state. Multi-omic studies of the proteome and metabolome are becoming more common as mass spectrometry technology continues to be democratized. However, knowledge extraction through the integration of these data remains challenging. RESULTS: Connections between molecules in different omic layers were discovered through a combination of machine learning and model interpretation. Discovered connections reflected protein control (ProC) over metabolites. Proteins discovered to control citrate were mapped onto known genetic and metabolic networks, revealing that these protein regulators are novel. Further, clustering the magnitudes of ProC over all metabolites enabled the prediction of five gene functions, each of which was validated experimentally. Two uncharacterized genes, YJR120W and YDL157C, were accurately predicted to modulate mitochondrial translation. Functions for three incompletely characterized genes were also predicted and validated, including SDH9, ISC1 and FMP52. A website enables results exploration and also MIMaL analysis of user-supplied multi-omic data. AVAILABILITY AND IMPLEMENTATION: The website for MIMaL is at https://mimal.app. Code for the website is at https://github.com/qdickinson/mimal-website. Code to implement MIMaL is at https://github.com/jessegmeyerlab/MIMaL. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Machine Learning , Metabolic Networks and Pathways , Cluster Analysis , Proteome
16.
JAMIA Open ; 5(3): ooac063, 2022 Oct.
Article in English | MEDLINE | ID: mdl-35958671

ABSTRACT

Objective: The rate of diabetic complication progression varies across individuals and understanding factors that alter the rate of complication progression may uncover new clinical interventions for personalized diabetes management. Materials and Methods: We explore how various machine learning (ML) models and types of electronic health records (EHRs) can predict fast versus slow onset of neuropathy, nephropathy, ocular disease, or cardiovascular disease using only patient data collected prior to diabetes diagnosis. Results: We find that optimized random forest models performed best to accurately predict the diagnosis of a diabetic complication, with the most effective model distinguishing between fast versus slow nephropathy (AUROC = 0.75). Using all data sets combined allowed for the highest model predictive performance, and social history or laboratory alone were most predictive. SHapley Additive exPlanations (SHAP) model interpretation allowed for exploration of predictors of fast and slow complication diagnosis, including underlying biases present in the EHR. Patients in the fast group had more medical visits, incurring a potential informed decision bias. Discussion: Our study is unique in the realm of ML studies as it leverages SHAP as a starting point to explore patient markers not routinely used in diabetes monitoring. A mix of both bias and biological processes is likely present in influencing a model's ability to distinguish between groups. Conclusion: Overall, model interpretation is a critical step in evaluating validity of a user-intended endpoint for a model when using EHR data, and predictors affected by bias and those driven by biologic processes should be equally recognized.
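
For readers unfamiliar with the AUROC metric reported above, the sketch below computes it from its pairwise definition: the probability that a randomly chosen positive case is scored higher than a randomly chosen negative one, with ties counted as half. The labels and scores are invented toy values, not EHR data, and the function name is illustrative.

```python
from typing import List


def auroc(labels: List[int], scores: List[float]) -> float:
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: 1 = "fast" onset, 0 = "slow" onset (hypothetical scores).
print(auroc([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))  # 3 of 4 pairs ranked correctly = 0.75
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives context for the 0.75 reported for nephropathy.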

17.
Anal Chem ; 94(36): 12452-12460, 2022 Sep 13.
Article in English | MEDLINE | ID: mdl-36044770

ABSTRACT

Proteomic analysis on the scale that captures population and biological heterogeneity over hundreds to thousands of samples requires rapid mass spectrometry methods, which maximize instrument utilization (IU) and proteome coverage while maintaining precise and reproducible quantification. To achieve this, a short liquid chromatography gradient paired to rapid mass spectrometry data acquisition can be used to reproducibly quantify a moderate set of analytes. High-throughput profiling at a limited depth is becoming an increasingly utilized strategy for tackling large sample sets but the time spent on loading the sample, flushing the column(s), and re-equilibrating the system reduces the ratio of meaningful data acquired to total operation time and IU. The dual-trap single-column configuration (DTSC) presented here maximizes IU in rapid analysis (15 min per sample) of blood and cell lysates by parallelizing trap column cleaning and sample loading and desalting with the analysis of the previous sample. We achieved 90% IU in low microflow (9.5 µL/min) analysis of blood while reproducibly quantifying 300-400 proteins and over 6000 precursor ions. The same IU was achieved for cell lysates and over 4000 proteins (3000 at CV below 20%) and 40,000 precursor ions were quantified at a rate of 15 min/sample. Thus, DTSC enables high-throughput epidemiological blood-based biomarker cohort studies and cell-based perturbation screening.


Subject(s)
Proteome , Proteomics , Biomarkers , Liquid Chromatography/methods , Humans , Mass Spectrometry/methods , Proteome/analysis , Proteomics/methods
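
The abstract describes instrument utilization (IU) as the ratio of meaningful data acquisition time to total operation time. A minimal sketch of that ratio follows; the 13.5-minute acquisition window is an assumption derived from the reported 90% IU and 15 min per sample, not a figure stated in the abstract.

```python
def instrument_utilization(acquisition_min: float, total_min: float) -> float:
    """Ratio of time spent acquiring meaningful data to total operation time."""
    if total_min <= 0 or acquisition_min < 0 or acquisition_min > total_min:
        raise ValueError("require 0 <= acquisition_min <= total_min and total_min > 0")
    return acquisition_min / total_min


# Assumed split: 13.5 min of useful acquisition within each 15 min cycle.
iu = instrument_utilization(13.5, 15.0)
print(f"{iu:.0%}")  # 90%

# The same 15 min cycle time implies 24 * 60 / 15 = 96 samples per day.
print(24 * 60 // 15)
```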
18.
ArXiv ; 2022 Apr 26.
Article in English | MEDLINE | ID: mdl-35547240

ABSTRACT

The COVID-19 pandemic has presented many challenges that have spurred biotechnological research to address specific problems. Diagnostics is one area where biotechnology has been critical. Diagnostic tests play a vital role in managing a viral threat by facilitating the detection of infected and/or recovered individuals. From the perspective of what information is provided, these tests fall into two major categories, molecular and serological. Molecular diagnostic techniques assay whether a virus is present in a biological sample, thus making it possible to identify individuals who are currently infected. Additionally, when the immune system is exposed to a virus, it responds by producing antibodies specific to the virus. Serological tests make it possible to identify individuals who have mounted an immune response to a virus of interest and therefore facilitate the identification of individuals who have previously encountered the virus. These two categories of tests provide different perspectives valuable to understanding the spread of SARS-CoV-2. Within these categories, different biotechnological approaches offer specific advantages and disadvantages. Here we review the categories of tests developed for the detection of the SARS-CoV-2 virus or antibodies against SARS-CoV-2 and discuss the role of diagnostics in the COVID-19 pandemic.

19.
PLoS Comput Biol ; 18(1): e1009736, 2022 Jan.
Article in English | MEDLINE | ID: mdl-35089914

ABSTRACT

Machine learning with multi-layered artificial neural networks, also known as "deep learning," is effective for making biological predictions. However, model interpretation is challenging, especially for sequential input data used with recurrent neural network architectures. Here, we introduce a framework called "Positional SHAP" (PoSHAP) to interpret models trained from biological sequences by utilizing SHapley Additive exPlanations (SHAP) to generate positional model interpretations. We demonstrate this using three long short-term memory (LSTM) regression models that predict peptide properties, including binding affinity to major histocompatibility complexes (MHC), and collisional cross section (CCS) measured by ion mobility spectrometry. Interpretation of these models with PoSHAP reproduced MHC class I (rhesus macaque Mamu-A1*001 and human A*11:01) peptide binding motifs, reflected known properties of peptide CCS, and provided new insights into interpositional dependencies of amino acid interactions. PoSHAP should have widespread utility for interpreting a variety of models trained from biological sequences.


Subject(s)
Computational Biology/methods , Deep Learning , Biological Models , Protein Sequence Analysis/methods , Amino Acid Sequence , Animals , Binding Sites , Humans , Macaca mulatta , Peptides/chemistry , Peptides/metabolism
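
The idea of attributing a prediction to sequence positions can be illustrated with exact Shapley values on a tiny example. This sketch is not the PoSHAP implementation: it replaces the LSTM models and the SHAP library with a hypothetical three-residue scoring function and brute-force enumeration, and it treats "X" as an assumed baseline residue for masked positions.

```python
from itertools import combinations
from math import factorial


def toy_model(seq: str) -> float:
    """Hypothetical scorer: K at position 1 adds 2; A at 0 together with R at 2 adds 1."""
    score = 0.0
    if seq[1] == "K":
        score += 2.0
    if seq[0] == "A" and seq[2] == "R":
        score += 1.0
    return score


def mask(seq: str, keep: set, baseline: str = "X") -> str:
    """Keep residues at the given positions and replace the rest with the baseline."""
    return "".join(ch if i in keep else baseline for i, ch in enumerate(seq))


def positional_shapley(model, seq: str) -> list:
    """Exact Shapley value of each position, enumerating all subsets of the others."""
    n = len(seq)
    values = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = model(mask(seq, set(subset)))
                with_i = model(mask(seq, set(subset) | {i}))
                phi += weight * (with_i - without_i)
        values.append(phi)
    return values


phi = positional_shapley(toy_model, "AKR")
print(phi)  # approximately [0.5, 2.0, 0.5]
```

Positions 0 and 2 share the interaction term equally, a small-scale analogue of the interpositional dependencies mentioned in the abstract. Enumeration scales exponentially with sequence length, so it is practical only for toy cases like this one.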
20.
Anal Chem ; 94(4): 1965-1973, 2022 Feb 01.
Article in English | MEDLINE | ID: mdl-35044165

ABSTRACT

While much effort has been placed on comprehensive quantitative proteome analysis, certain applications demand the measurement of only a few target proteins from complex systems. Traditional approaches to targeted proteomics rely on nanoliquid chromatography (nLC) and targeted mass spectrometry (MS) methods, e.g., parallel reaction monitoring (PRM). However, the time requirement for nLC can limit the throughput of targeted proteomics. To achieve rapid and high-throughput targeted methods, here we show that nLC separations can be eliminated and replaced with direct infusion shotgun proteome analysis (DISPA) using high-field asymmetric waveform ion mobility spectrometry (FAIMS) with PRM. We demonstrate the application of DISPA-PRM for rapid targeted quantification of bacterial enzymes utilized in the production of biofuels by monitoring temporal expression in 72 metabolically engineered bacterial cultures in less than 2.5 h, with a measured dynamic range >1200-fold. We conclude that DISPA-PRM presents a valuable innovative tool with results comparable to nLC-MS/MS, enabling rapid detection of targeted proteins in complex mixtures.


Subject(s)
Proteome , Tandem Mass Spectrometry , Ion Mobility Spectrometry , Proteome/analysis , Proteomics/methods